Accelerated data operations

ABSTRACT

Systems, methods, and software described herein facilitate accelerated input and output operations with respect to virtualized environments. In an implementation, upon being notified of a guest read process initiated by a guest element running in a virtual machine to read data into a location in guest memory associated with the guest element, a computing system identifies a location in host memory associated with the location in the guest memory and initiates a host read process to read the data into the location in the host memory that corresponds to the location in the guest memory.

RELATED APPLICATIONS

This application is a continuation of, and claims priority to, U.S. patent application Ser. No. 14/330,928, entitled “ACCELERATED DATA OPERATIONS IN VIRTUAL ENVIRONMENTS,” filed on Jul. 14, 2014, which itself claims priority to U.S. Provisional Patent Application No. 61/845,479, entitled “ACCELERATED DATA OPERATIONS IN VIRTUAL ENVIRONMENTS,” filed on Jul. 12, 2013, both of which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD

Aspects of the disclosure are related to computing hardware and software technology and to accelerated data input and output in virtual environments.

TECHNICAL BACKGROUND

An increasing number of data-intensive distributed applications are being developed to serve various needs, such as processing very large data sets that generally cannot be handled by a single computer. Instead, clusters of computers are employed to distribute various tasks, such as organizing and accessing the data and performing related operations with respect to the data. Various applications and frameworks have been developed to interact with such large data sets, including Hive, HBase, Hadoop, Amazon S3, and CloudStore, among others.

At the same time, virtualization techniques have gained popularity and are now commonplace in data centers and other environments in which it is useful to increase the efficiency with which computing resources are used. In a virtualized environment, one or more virtual machines are instantiated on an underlying computer (or another virtual machine) and share the resources of the underlying computer. However, deploying data-intensive distributed applications across clusters of virtual machines has generally proven impractical due to the latency associated with feeding large data sets to the applications.

OVERVIEW

Provided herein are systems, methods, and software for implementing accelerated data input and output with respect to virtualized environments. Data requested by a guest element running in a virtual machine is delivered to the guest element by way of a region in host memory that is mapped to a region in guest memory associated with the guest element. In this manner, the delivery of data from its source to where it can be consumed by a guest element, such as a virtualized data-intensive distributed application, is accelerated.

In at least one implementation, upon being notified of a guest read process initiated by a guest element running in a virtual machine to read data into a location in guest memory associated with the guest element, a computing system identifies a location in host memory associated with the location in the guest memory and initiates a host read process to read the data into the location in the host memory that corresponds to the location in the guest memory.

This Overview is provided to introduce a selection of concepts in a simplified form that are further described below in the Technical Disclosure. It should be understood that this Overview is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the disclosure can be better understood with reference to the following drawings. While several implementations are described in connection with these drawings, the disclosure is not limited to the implementations disclosed herein. On the contrary, the intent is to cover all alternatives, modifications, and equivalents.

FIG. 1 illustrates an operational scenario in an implementation.

FIG. 2A illustrates a method in an implementation.

FIG. 2B illustrates a method in an implementation.

FIG. 3 illustrates a computing architecture in an implementation.

FIG. 4 illustrates an operational scenario in an implementation.

FIG. 5 illustrates a server system.

TECHNICAL DISCLOSURE

Various implementations described herein provide for accelerated data operations in which data is provided for consumption to applications executing within virtual environments in a manner that enables big data jobs and other resource-intensive tasks to be virtualized. In particular, data that resides externally with respect to a virtual environment can be read by guest elements executing within the virtual environment at a pace sufficient to allow for very data-intensive jobs. This is accomplished by enhancing the read process such that data read from a source is written to host memory that is associated with guest memory allocated to the guest elements.

In at least one implementation, a virtual machine is instantiated within a host environment. The virtual machine may be instantiated by a hypervisor running in the host environment. The hypervisor may run with or without an operating system beneath it. For example, in some implementations the hypervisor may be implemented at a layer above the host operating system, while in other implementations the hypervisor may be integrated with the operating system. Other hypervisor configurations are possible and may be considered within the scope of the present disclosure.

The virtual machine may include various guest elements, such as a guest operating system and its components, guest applications, and the like, that consume and execute on data. The virtual machine may also include virtual representations of various computing components, such as guest memory, a guest storage system, and a guest processor.

In operation, a guest element running in the virtual machine is allocated a portion of the guest memory available in the virtual machine. In normal operation, a guest element would initiate a read process to read data from a guest storage system in the virtual machine. The data would be read from the guest storage system and written to the portion of the guest memory allocated to the guest element.

It may be appreciated that when data is read or written in the virtual machine, a non-virtual representation of the data is being manipulated through the hypervisor in host memory or in a host storage system. One key task of the hypervisor in normal operation is to manage the reading and writing of data to and from host resources such that operations taken within the virtual machine are reflected in the host resources. Accordingly, when a region in guest memory is allocated to a guest element, a corresponding region in host memory is, by implication, effectively allocated to the guest element. This is because host resources such as memory or disk space are typically divided and allocated on a per-machine basis to each of multiple virtual machines running on the host. This is in accordance with the principle of isolation that allows each individual virtual machine to run in isolation from any others.

However, other applications, code, or components may also run on the host, in addition to any virtual machines that may be instantiated. These host elements are also isolated from the virtual machines. For example, a region in host memory that is allocated to a host element does not typically overlap with a region in host memory that is, by implication, allocated to a guest element.

In the past, in order to feed data to big-data applications executing in a virtual machine, data would first have to be read from a source and written to a region in host memory from which it could be transferred into the virtual machine. The data would then be transferred into the virtual machine and written to a region in guest memory associated with the consuming application. The multiple steps taken to get data to the executing applications made running data-intensive tasks problematic with respect to virtual environments.

In an enhancement, a more direct route for the data is proposed that avoids at least some of the various aforementioned steps in the data transfer process. Rather than reading data into a region in host memory and then having to transfer the data into the virtual machine, data can be read directly into a region in host memory that is mapped to a region in guest memory associated with a consuming application or element. In this manner, data transfer steps are eliminated that previously made running big data applications on virtual machines an impractical proposition.

In various scenarios, a specialized input/output driver is employed in the virtual machine with which a guest application or other element communicates to access files or other data on storage sub-systems. The input/output driver provides an abstracted view of a more complex storage service or infrastructure to the guest application, allowing the guest application to initiate reads and writes without modification to any of its characteristics.

The input/output driver may appear to the guest application as any type of guest storage sub-system, such as a disk drive, solid state drive, network attached storage, or other suitable storage. In actuality, the input/output driver communicates with a translation module running in the host environment to carry out the reads or writes initiated by the guest application. The translation module communicates with another host element or service capable of reading data from a source and writing the data to an appropriate host memory location that is mapped to an appropriate guest memory location.

In a brief operational scenario, a memory mapping process is carried out to map regions in host memory to regions in guest memory allocated to a guest application. Once the mapping is accomplished, the translation module can translate the guest memory location identified in read or write requests (predominantly read requests) to an associated host memory location. Thus, data can be made accessible for consumption by the guest application simply by reading it from its source into a host memory location. Because the host memory location is mapped to the guest memory location, the guest application is able to access and process the data without any further reads or writes of the data. In some implementations, the region in host memory may be a region allocated to the translation node and its processes, although the region may be allocated to other elements or processes.
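To make the translation step concrete, the following C sketch shows one way a translation module could record guest-to-host mappings and resolve the guest memory location named in a read or write request. The structure, function names, and fixed table size are illustrative assumptions, not details taken from the disclosure.

    /*
     * Minimal sketch (assumptions noted above) of a guest-to-host mapping
     * table and the translation of a guest address into its host counterpart.
     */
    #include <stddef.h>
    #include <stdint.h>

    struct mapping {
        uint64_t guest_base;  /* start of a mapped region in guest memory   */
        uint64_t host_base;   /* corresponding start address in host memory */
        uint64_t length;      /* size of the mapped region in bytes         */
    };

    #define MAX_MAPPINGS 64
    static struct mapping map_table[MAX_MAPPINGS];
    static size_t map_count;

    /* Translate a guest address to a host address; returns 0 if unmapped. */
    static uint64_t translate_guest_to_host(uint64_t guest_addr)
    {
        for (size_t i = 0; i < map_count; i++) {
            const struct mapping *m = &map_table[i];
            if (guest_addr >= m->guest_base &&
                guest_addr <  m->guest_base + m->length)
                return m->host_base + (guest_addr - m->guest_base);
        }
        return 0; /* no mapping covers this guest address */
    }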

FIG. 1 illustrates one representative operational scenario 100 in which enhanced data operations are accomplished in accordance with at least some of the principles and ideas discussed above. In operational scenario 100, a host environment 101 is illustrated in which a virtual machine 111 may be implemented. Host environment 101 includes host memory 103, storage service 105, and translation node 107. It may be appreciated that host environment 101 may include other elements not shown for purposes of clarity. Host environment 101 may be implemented on any suitable computing system or systems.

Virtual machine 111 includes guest memory 113, guest element 115, and data interface 117. Virtual machine 111 may include other elements not illustrated for purposes of clarity. It may also be appreciated that one or more additional virtual machines may be instantiated and running within host environment 101, in addition to virtual machine 111.

In operational scenario 100, a memory mapping process is carried out to map guest memory 113 to host memory 103 (step 0). In this example, guest memory 113 includes various blocks L, M, and N, which are representative of different regions, locations, or other sub-divisions of guest memory 113. Host memory 103 includes blocks U, V, W, X, Y, and Z, which are representative of different regions, locations, or other sub-divisions of host memory 103. In this scenario, block L is mapped to block U, block M is mapped to block V, and block N is mapped to block W.

Once guest memory 113 is mapped to host memory 103, enhanced data operations can commence to enable guest element 115 to consume data at a pace sufficient for data-intensive application purposes. Continuing with operational scenario 100, guest element 115 communicates with data interface 117 to initiate a read process (step 1). As part of the read process, guest element 115 indicates to data interface 117 what data it is attempting to read, which in this scenario is represented by a portion A of data 106. The portion A of data 106 being read may be a file, a portion of a file, or any other type of data or sub-division of data 106. Guest element 115 also indicates to data interface 117 where in guest memory the requested data should be deposited. In this scenario, block N is identified for the read process. It may be appreciated that some element other than guest element 115 may determine and identify the location in guest memory 113 into which the portion A of data 106 may be read.

Data interface 117 responsively communicates directly or indirectly with translation node 107 to notify translation node 107 of the read request (step 2). In at least one implementation, a request queue is maintained by data interface 117 and shared with translation node 107 that details various read requests initiated by guest element 115 or any other element. Translation node 107 may monitor the request queue and handle read requests as they occur. Other mechanisms are possible, such as direct calls between data interface 117 and translation node 107. In some implementations, a completion queue is also maintained onto which translation node 107 records which read requests have been completed.
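As one illustration of such a shared mechanism, the C sketch below outlines request and completion queues of the kind data interface 117 and translation node 107 might share. The field names, queue depth, and single-producer/single-consumer layout are assumptions made for illustration only.

    #include <stdint.h>

    /* One outstanding guest read request as it might appear on the shared
     * queue; the field layout is an assumption, not the disclosed format. */
    struct read_request {
        uint64_t request_id;    /* identifier echoed on the completion queue   */
        uint64_t source_offset; /* where the data lives at the storage service */
        uint64_t length;        /* number of bytes to read                     */
        uint64_t guest_addr;    /* destination location in guest memory        */
    };

    /* A simple single-producer/single-consumer ring shared between the data
     * interface (producer) and the translation node (consumer). */
    #define QUEUE_DEPTH 256
    struct request_queue {
        volatile uint32_t head;                 /* advanced by the producer */
        volatile uint32_t tail;                 /* advanced by the consumer */
        struct read_request slots[QUEUE_DEPTH];
    };

    /* A matching completion queue onto which the translation node records
     * which read requests have finished. */
    struct completion_queue {
        volatile uint32_t head;
        volatile uint32_t tail;
        uint64_t completed_ids[QUEUE_DEPTH];
    };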

Having been notified of the read request, translation node 107 proceeds to translate the location in guest memory 113 identified by the read process to its corresponding location in host memory 103 (step 3). In this scenario, block W is associated with block N. Translation node 107 then advances the read process by communicating the subject of the read process and its destination to storage service 105 (step 4).

Storage service 105 may be any service capable of handling read or write processes with respect to data 106. Storage service 105 may be implemented by a host storage system with respect to a local storage device, such as a hard disk drive, solid state drive, or any attached storage. In addition, storage service 105 may be implemented by any other element or collection of elements in host environment 101 that may facilitate interaction with other storage elements, such as a network attached storage element or other remote storage facility.

Storage service 105 obtains the portion A of data 106 implicated by the read process, whether by reading from a disk drive or other local storage device or possibly communicating with remote elements, and writes the portion A of data 106 to block W in host memory (step 5). Accordingly, because block W is mapped to block N in guest memory 113, guest element 115 is able to access the portion A of data 106 without any other transfer operations needing to occur between host environment 101 and virtual machine 111.

In at least one implementation, guest memory is mapped into a cnode process space. A “cnode” as used herein refers to any application program or module suitable for implementing a translation node, such as translation node 107. A kernel driver in a host environment performs this mapping. The kernel driver is given the PID (process ID) of a qemu process (hypervisor process) that spawns a virtual machine. Using the PID, the kernel driver interrogates the kernel's internal data structure for the qemu process, searching all of its memory allocations.

Next, the kernel driver determines which allocation represents the virtual machine's memory. The kernel driver pauses the qemu process while this memory allocation is mapped into the cnode space. This mapping technique is similar to how a regular driver will map user space memory for a direct memory access (DMA) (bus master) I/O operation. In some implementations, memory may be considered double-mapped. Copies of the page tables for the mapping from the qemu process can be made and applied to the cnode process.

The mapping may occur in two stages. In a first stage, the PID is passed to the kernel driver via an input/output control (IOCTL) call. The driver looks up the process in the kernel's internal data structure, finds the correct memory allocation, and saves off this information in its own data structure. It returns an offset to the caller and a status of success. The caller (cnode) will then use this offset to invoke a standard UNIX (or Linux) mmap system call. The kernel driver supports the mmap operation. The operating system will call the kernel driver's mmap entry points to do the actual mapping. Using the offset passed in, the mmap portion of the kernel driver will look in its internal data structure to find the information that it saved off from the IOCTL call. It will then pause the qemu process, copy its user's pages for the memory allocation, and use them to lock down the pages and create a mapping to the cnode process.
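The two-stage flow can be pictured from the cnode's side as in the C sketch below. This is a minimal sketch under stated assumptions: the device node path /dev/guestmap, the IOCTL request code, and the mapdrv_request structure are hypothetical placeholders rather than the driver's actual interface; only the open/ioctl/mmap calls themselves are standard.

    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define MAPDRV_IOCTL_SET_PID 0x4d01   /* assumed IOCTL request code */

    struct mapdrv_request {               /* assumed argument layout */
        pid_t    qemu_pid;   /* PID of the hypervisor (qemu) process */
        uint64_t offset;     /* filled in by the driver on success   */
        uint64_t length;     /* size of the VM memory allocation     */
    };

    int map_guest_memory(pid_t qemu_pid, void **out, uint64_t *len)
    {
        int fd = open("/dev/guestmap", O_RDWR);   /* hypothetical device node */
        if (fd < 0)
            return -1;

        /* Stage 1: hand the PID to the kernel driver; it locates the qemu
         * process's memory allocation and returns an mmap offset. */
        struct mapdrv_request req = { .qemu_pid = qemu_pid };
        if (ioctl(fd, MAPDRV_IOCTL_SET_PID, &req) != 0) {
            close(fd);
            return -1;
        }

        /* Stage 2: a standard mmap() against the driver; per the disclosure,
         * the driver's mmap entry point pauses qemu, copies its page tables,
         * locks the pages, and creates the mapping into this cnode process. */
        void *base = mmap(NULL, req.length, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, (off_t)req.offset);
        close(fd);
        if (base == MAP_FAILED)
            return -1;

        *out = base;
        *len = req.length;
        return 0;
    }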

It may be appreciated that mapping guest memory into host memory is an enhancement that enables accelerated data input and output as discussed herein. In some implementations, this is accomplished by modifying the kernel API for locking a region into memory for a DMA I/O operation. The API in some cases may be intended for use with a PCI device to be able to perform bus mastering I/O operations. Instead, here the API is used for locking the memory down. Linux kernel structures are accessed directly in order to get the required page tables for performing the mmap operation. The mmap operation is otherwise intended for a proprietary PCI device to be able to map its memory into a user process space.

FIG. 2A illustrates a process 200A that may be carried out by data interface 117 or any other suitable input/output driver. In operation, data interface 117 receives a read request from a guest application (step 201). The read request identifies target data for consumption by the guest application and a location (or locations) in guest memory into which to read the target data. Data interface 117 advances the read process by communicating the read request to translation node 107 or some other suitable host element (step 203). This may be accomplished by, for example, maintaining a shared queue of pending and completed read requests. Upon being notified that a read process has been completed, data interface 117 informs the guest application of the same so that the guest application can access the data (step 205) in its location in guest memory.
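A sketch of process 200A from the data interface's side follows, assuming the shared request and completion queues sketched earlier; the helpers queue_post_read, queue_poll_complete, and notify_guest are hypothetical stand-ins, not functions named in the disclosure.

    #include <stdbool.h>
    #include <stdint.h>

    struct guest_read {            /* what the guest application asked for */
        uint64_t source_offset;    /* identifies the target data           */
        uint64_t length;
        uint64_t guest_addr;       /* destination location in guest memory */
    };

    /* Assumed helpers backed by the shared request/completion queues. */
    extern uint64_t queue_post_read(const struct guest_read *r);
    extern bool     queue_poll_complete(uint64_t request_id);
    extern void     notify_guest(uint64_t guest_addr);

    void handle_guest_read(const struct guest_read *r)
    {
        /* Step 201: a read request arrives from the guest application.     */
        /* Step 203: forward it to the translation node via the queue.      */
        uint64_t id = queue_post_read(r);

        /* Wait until the translation node records the request as complete;
         * a real driver would sleep or use an interrupt instead of spinning. */
        while (!queue_poll_complete(id))
            ;

        /* Step 205: the data now sits at r->guest_addr in guest memory, so
         * inform the guest application that it can be accessed. */
        notify_guest(r->guest_addr);
    }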

FIG. 2B illustrates a process 200B that may be carried out by translation node 107 or any other suitable host element. In operation, translation node 107 detects that a read process has been initiated (step 207). In response, translation node 107 translates the location in the guest memory implicated by the read process into a corresponding location in host memory (step 209). With this new memory location, translation node 107 advances the read process (step 211). For example, translation node 107 may communicate with a storage sub-system or service such that the target data can be read from a source and written to the location in host memory mapped to the designated location in guest memory.
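Process 200B can be pictured as a service loop on the host side, as in the sketch below. The queue, translation, and storage helpers are assumptions standing in for the shared queue, the mapping table, and the storage service interface.

    #include <stdbool.h>
    #include <stdint.h>

    struct pending_read {
        uint64_t request_id;
        uint64_t source_offset;
        uint64_t length;
        uint64_t guest_addr;
    };

    /* Assumed helpers for the queue, mapping table, and storage service. */
    extern bool     queue_next_request(struct pending_read *out);       /* step 207 */
    extern uint64_t translate_guest_to_host(uint64_t guest_addr);       /* step 209 */
    extern void     storage_read_into(uint64_t src, uint64_t len,
                                      uint64_t host_addr);              /* step 211 */
    extern void     queue_mark_complete(uint64_t request_id);

    void translation_node_service(void)
    {
        struct pending_read req;
        while (queue_next_request(&req)) {
            /* Translate the guest destination into its host counterpart. */
            uint64_t host_addr = translate_guest_to_host(req.guest_addr);
            if (host_addr == 0)
                continue;  /* no mapping; error handling omitted */

            /* Have the storage service read the data straight into the host
             * location mapped to the designated guest location. */
            storage_read_into(req.source_offset, req.length, host_addr);
            queue_mark_complete(req.request_id);
        }
    }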

Referring now to FIG. 3, computing architecture 300 is representative of an architecture that may be employed in any computing apparatus, system, or device, or collections thereof, to suitably implement all or portions of host environment 101 and virtual machine 111, as well as processes 200A and 200B illustrated in FIGS. 2A and 2B, or variations thereof. Host environment 101, virtual machine 111, process 200A, and process 200B may be implemented on a single apparatus, system, or device or may be implemented in a distributed manner. Processes 200A and 200B may be integrated with host environment 101 or virtual machine 111, but may also stand alone or be embodied in some other application.

Computing architecture 300 may be employed in, for example, server computers, cloud computing platforms, data centers, any physical or virtual computing machine, and any variation or combination thereof. In addition, computing architecture 300 may be employed in desktop computers, laptop computers, or the like.

Computing architecture 300 includes processing system 301, storage system 303, software 305, communication interface system 307, and user interface system 309. Processing system 301 is operatively coupled with storage system 303, communication interface system 307, and user interface system 309. Processing system 301 loads and executes software 305 from storage system 303. When executed by processing system 301, software 305 directs processing system 301 to operate as described in operational scenario 100 with respect to host environment 101 and virtual machine 111 and their elements. Computing architecture 300 may optionally include additional devices, features, or functionality not discussed here for purposes of brevity.

Referring still to FIG. 3, processing system 301 may comprise a microprocessor and other circuitry that retrieves and executes software 305 from storage system 303. Processing system 301 may be implemented within a single processing device but may also be distributed across multiple processing devices or sub-systems that cooperate in executing program instructions. Examples of processing system 301 include general purpose central processing units, application specific processors, and logic devices, as well as any other type of processing device, combinations, or variations thereof.

Storage system 303 may comprise any computer readable storage media readable by processing system 301 and capable of storing software 305. Storage system 303 may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Examples of storage media include random access memory, read only memory, magnetic disks, optical disks, flash memory, virtual memory and non-virtual memory, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other suitable storage media. In no case is the storage media a propagated signal.

In addition to storage media, in some implementations storage system 303 may also include communication media over which software 305 may be communicated internally or externally. Storage system 303 may be implemented as a single storage device but may also be implemented across multiple storage devices or sub-systems co-located or distributed relative to each other. Storage system 303 may comprise additional elements, such as a controller, capable of communicating with processing system 301 or possibly other systems.

Software 305 may be implemented in program instructions and, among other functions, may, when executed by processing system 301, direct processing system 301 to operate as described herein by operational scenario 100 with respect to host environment 101 and virtual machine 111, as well as by process 200A and process 200B. In particular, the program instructions may include various components or modules that cooperate or otherwise interact to carry out operational scenario 100, process 200A, and process 200B. The various components or modules may be embodied in compiled or interpreted instructions or in some other variation or combination of instructions. The various components or modules may be executed in a synchronous or asynchronous manner, in serial or in parallel, in a single-threaded or multi-threaded environment, or in accordance with any other suitable execution paradigm, variation, or combination thereof. Software 305 may include additional processes, programs, or components, such as operating system software, hypervisor software, or other application software. Software 305 may also comprise firmware or some other form of machine-readable processing instructions executable by processing system 301.

In general, software 305 may, when loaded into processing system 301 and executed, transform a suitable apparatus, system, or device employing computing architecture 300 overall from a general-purpose computing system into a special-purpose computing system customized to facilitate accelerated data input and output with respect to virtualized environments. Indeed, encoding software 305 on storage system 303 may transform the physical structure of storage system 303. The specific transformation of the physical structure may depend on various factors in different implementations of this description. Examples of such factors may include, but are not limited to, the technology used to implement the storage media of storage system 303 and whether the computer-storage media are characterized as primary or secondary storage, as well as other factors.

For example, if the computer-storage media are implemented as semiconductor-based memory, software 305 may transform the physical state of the semiconductor memory when the program is encoded therein, such as by transforming the state of transistors, capacitors, or other discrete circuit elements constituting the semiconductor memory. A similar transformation may occur with respect to magnetic or optical media. Other transformations of physical media are possible without departing from the scope of the present description, with the foregoing examples provided only to facilitate this discussion.

It should be understood that computing architecture 300 is generally intended to represent an architecture on which software 305 may be deployed and executed in order to implement operational scenario 100 (or variations thereof). However, computing architecture 300 may also be suitable for any computing system on which software 305 may be staged and from where software 305 may be distributed, transported, downloaded, or otherwise provided to yet another computing system for deployment and execution, or yet additional distribution.

Communication interface system 307 may include communication connections and devices that allow for communication with other computing systems (not shown) over a communication network or collection of networks (not shown). Examples of connections and devices that together allow for inter-system communication may include network interface cards, antennas, power amplifiers, RF circuitry, transceivers, and other communication circuitry. The connections and devices may communicate over communication media, such as metal, glass, air, or any other suitable communication media, to exchange communications with other computing systems or networks of systems. The aforementioned communication media, network, connections, and devices are well known and need not be discussed at length here.

User interface system 309, which is optional, may include a mouse, a voice input device, a touch input device for receiving a touch gesture from a user, a motion input device for detecting non-touch gestures and other motions by a user, and other comparable input devices and associated processing elements capable of receiving user input from a user. Output devices such as a display, speakers, haptic devices, and other types of output devices may also be included in user interface system 309. In some cases, the input and output devices may be combined in a single device, such as a display capable of displaying images and receiving touch gestures. The aforementioned user input and output devices are well known in the art and need not be discussed at length here. User interface system 309 may also include associated user interface software executable by processing system 301 in support of the various user input and output devices discussed above. Separately or in conjunction with each other and other hardware and software elements, the user interface software and devices may support a graphical user interface, a natural user interface, or any other suitable type of user interface.

FIG. 4 illustrates memory mapping operational scenario 400 in which a portion of physical host memory is mapped to virtual memory. In operational scenario 400, at least a portion of addressable memory space for host memory 401 is illustrated. For simplicity, positions in host memory are addressable using 16-bit values represented in operational scenario 400 by hexadecimal numbers ranging from 0000-FFFF, with each hex value in the range addressing a position in memory. In practice, however, host memory 401 will likely include many more positions than can be accessed by 16-bit values and, therefore, may include many more memory addresses that are addressable with values larger than 16 bits (e.g., 32-bit or 64-bit values).

In operational scenario 400, a guest element requires guest memory 402 to operate. The guest element may be the operating system of a virtual machine or may be a process running within a virtual machine. That is, guest memory 402 may be the entire amount of physical host memory allotted to a virtual machine or may be an amount of virtual memory allotted to a process running on the virtual machine, which is less than the total amount of host memory allotted to the virtual machine. In this example, the guest element is allocated 20,480 guest memory positions illustrated by guest memory 402 as positions 0000-4FFF. Since the guest element accesses guest memory 402 as though guest memory 402 is physical memory, the guest element will use addresses 0000-4FFF rather than their corresponding addresses in host memory 401. These guest memory positions are mapped, on a one-to-one basis, to their corresponding positions in host memory 401 (step 0).

It should be understood that, while guest memory 402's position within host memory 401 comprises a single block of sequential addresses, guest memory 402 might comprise multiple non-contiguous address blocks. Similarly, guest memory 402 does not necessarily translate to a single contiguous block of addresses in host memory 401 but instead may be separated into multiple address blocks within host memory 401.

After mapping, the guest process requests a read of data from a storage service into the memory position addressed by 1400 (step 1). In other words, the guest process is going to use the requested data for some purpose and, therefore, requests that the data be put into guest memory for access. As that read request is passed to the physical host machine, location 1400 within guest memory 402 is translated to its corresponding location in host memory 401, which will actually store the data requested by the guest element (step 2). In this example, since guest memory 402 takes up a corresponding block of host memory 401, guest memory location 1400 is simply 1400 locations offset from location 6000 in host memory 401. Thus, guest memory location 1400 corresponds to host memory location 7400.
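The offset arithmetic of steps 1 and 2 can be checked with a few lines of C; the host base address 0x6000 is taken from the scenario above, and the contiguous one-block mapping is the same simplifying assumption the scenario makes.

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        const uint16_t host_base  = 0x6000;  /* where guest memory 402 begins in host memory 401 */
        const uint16_t guest_addr = 0x1400;  /* location named in the guest's read request       */

        uint16_t host_addr = host_base + guest_addr;  /* simple offset translation */
        assert(host_addr == 0x7400);                  /* matches step 2 above      */
        return 0;
    }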

Once host memory location 7400 is determined, the requested data is retrieved from the storage service and is written to host memory location 7400 (step 3). It should be noted that the data may take up more than one location indexed from location 7400. After the data has been written to location 7400, the guest element can then access the data by reading the data out of location 1400 of guest memory 402 (step 4). Since location 1400 corresponds to, and is essentially just an alternative address for, location 7400 in host memory 401, the guest element's access of location 1400 simply accesses the data in location 7400 of host memory 401. Accordingly, during operation 400, the data requested by the guest element is read directly into the memory location of host memory 401 from which the data will be accessed by the guest element.

It should be understood that an operation similar to operation 400 might also be used when the guest element requests to write data to a storage service. For example, the guest element requests that the data in location 1400 of guest memory 402 be written to the storage service. Location 1400 is translated to location 7400 of host memory 401 and the data in location 7400 is transferred to the storage service.

It should be further understood that, while the above operation refers to data read only into a single location (i.e., 1400 of guest memory 402), the same principles apply to data read into multiple locations. That is, the requested data may include more information than can be stored in a single memory location and, therefore, needs to be spread across multiple locations. These multiple locations, whether or not they are contiguous locations within guest memory 402 or host memory 401, are mapped between guest memory 402 and host memory 401 just as the single location 1400 was above.

FIG. 5 illustrates a server system 500 that operates in accordance with the embodiments described above. System 500 includes physical host computing systems 501 and 502, each comprising a host physical memory akin to host memory 401 of FIG. 4. Executing on one or more processors within host computing system 501 are cache service 560 and hypervisor 550. Executing on one or more processors within host computing system 502 are cache service 560 and hypervisor 551.

Hypervisors 550 and 551 are used to instantiate virtual machines 521-524, as illustrated. Virtual machines (VMs) 521-524 are used to process large amounts of data and may include various guest elements, such as guest operating systems 541-544 and their components, guest applications, and the like. The virtual machines may also include virtual representations of computing components, such as guest memory (e.g., guest memory 402 from FIG. 4), a guest storage system, and a guest processor.

In this example, Job 571 is executing on VM 521 and VM 522, which is also executing Job 572. Job 572 is further executing on VMs 523 and 524. Each of jobs 571 and 572 processes data retrieved from data repository 580, which may be a storage service accessible by both host 501 and host 502 over a communication network, such as a local area network (LAN), or at least a portion of data repository 580 may be included in at least one of hosts 501 and 502. Cache service 560 communicates with data repository 580 to facilitate the supply of data to each VM 521-524 through hypervisors 550 and 551. Since multiple virtual machines are running each of jobs 571 and 572, cache service 560 coordinates the use of data and host memory used by each instance. The coordination may depend upon a variety of factors, including quality of service for each job, amount of host memory available for each job, priority of each job, and the like.

Using the operations described above, each job executing on VMs 521-523 is allocated a region of the guest memory allocated to each of VMs 521-523. Similarly, job 572 is allocated a region of the guest memory allocated to VM 524. These regions of virtual memory are mapped to their corresponding regions within the memory of hosts 501 and 502, respectively. Accordingly, when a job requests that data be read into locations within its allocated region of guest memory from data repository 580, the locations of guest memory are translated to corresponding host locations and cache service 560 reads the requested data from data repository 580 into the corresponding host locations. The translation step may occur at any software level represented within hosts 501 and 502 provided that the information necessary to map guest and host memory regions can be obtained at that level. Thus, the translations may occur at the cache service, within the hypervisor, within the VMs, or in some other software element executing in the stack.

Once the read process is complete, the requesting job can access the data from the guest locations designated in the request. Advantageously, to increase the efficiency of performing a job, the memory mapping from the example above allows instances of the job to be spread across multiple VMs, both on a single host system and across multiple host systems, while maintaining a data access speed nearing what would be achieved if the job instances were each running natively on a host system. Without such memory mapping, the requested data would first be stored in a non-corresponding location within host memory and then moved to the location corresponding to the guest memory used by the requesting job. This extra move step takes time, which accumulates significantly when processing large data sets.

The included descriptions and figures depict specific implementations to teach those skilled in the art how to make and use the best option. For the purpose of teaching inventive principles, some conventional aspects have been simplified or omitted. Those skilled in the art will appreciate variations from these implementations that fall within the scope of the invention. Those skilled in the art will also appreciate that the features described above can be combined in various ways to form multiple implementations. As a result, the invention is not limited to the specific implementations described above, but only by the claims and their equivalents.

What is claimed is:
 1. An apparatus comprising: one or more non-transitory computer readable storage media; and program instructions stored on the one or more non-transitory computer readable storage media, wherein the program instructions comprise a kernel driver that, when executed by a processing system, directs the processing system to at least: obtain a process identifier for a process running on a host; identify host memory allocated to the process using the process identifier; identify a portion of the host memory allocated to a virtual machine associated with the process; and map a portion of the host memory allocated to the virtual machine to a translation node running on the host.
 2. The apparatus of claim 1 wherein the program instructions further comprise the translation node, wherein the translation node, when executed by the processing system, directs the processing system to at least: upon being notified of a guest read process initiated by a guest element running in the virtual machine to read data into a location in guest memory associated with the guest element, identify a location in the portion of the host memory mapped to the location in the guest memory; and initiate a host read process to read the data into the location in the portion of the host memory mapped to the location in the guest memory.
 3. The apparatus of claim 2 wherein the process comprises a hypervisor process.
 4. The apparatus of claim 3 wherein the hypervisor process comprises a QEMU process.
 5. The apparatus of claim 4 wherein the virtual machine is launched by the QEMU process.
 6. The apparatus of claim 2 wherein the guest element comprises an operating system of the virtual machine.
 7. The apparatus of claim 2 wherein the guest element comprises a process executing within an operating system of the virtual machine.
 8. An apparatus comprising: one or more storage devices; a processing system operatively coupled with the one or more storage devices; and program instructions stored on the one or more storage devices, wherein the program instructions comprise a kernel driver that, when executed by the processing system, directs the processing system to at least: obtain a process identifier for a process running on a host; identify host memory allocated to the process using the process identifier; identify a portion of the host memory allocated to a virtual machine associated with the process; and map a portion of the host memory allocated to the virtual machine to a translation node running on the host.
 9. The apparatus of claim 8 wherein the program instructions further comprise the translation node, wherein the translation node, when executed by the processing system, directs the processing system to at least: upon being notified of a guest read process initiated by a guest element running in the virtual machine to read data into a location in guest memory associated with the guest element, identify a location in the portion of the host memory mapped to the location in the guest memory; and initiate a host read process to read the data into the location in the portion of the host memory mapped to the location in the guest memory.
 10. The apparatus of claim 9 wherein the process comprises a hypervisor process.
 11. The apparatus of claim 10 wherein the hypervisor process comprises a QEMU process.
 12. The apparatus of claim 11 wherein the virtual machine is launched by the QEMU process.
 13. The apparatus of claim 9 wherein the guest element comprises an operating system of the virtual machine.
 14. The apparatus of claim 9 wherein the guest element comprises a process executing within an operating system of the virtual machine.
 15. A method comprising: in a kernel driver: obtaining a process identifier for a process running on a host; identifying host memory allocated to the process using the process identifier; identifying a portion of the host memory allocated to a virtual machine associated with the process; and mapping a portion of the host memory allocated to the virtual machine to a different process running on the host.
 16. The method of claim 15 further comprising the different process: upon being notified of a guest read process initiated by a guest element running in the virtual machine to read data into a location in guest memory associated with the guest element, identifying a location in the portion of the host memory mapped to the location in the guest memory; and initiating a host read process to read the data into the location in the portion of the host memory mapped to the location in the guest memory.
 17. The method of claim 16 wherein the process comprises a hypervisor process.
 18. The method of claim 17 wherein the hypervisor process comprises a QEMU process.
 19. The method of claim 16 wherein the guest element comprises an operating system of the virtual machine.
 20. The method of claim 16 wherein the guest element comprises a process executing within an operating system of the virtual machine. 